I.  INTRODUCTION


Predictive source encoding with distortion is considered for an analog source in the presence of an outlier model. In particular, a stationary Gaussian source is assumed, together with observation data that are a mixture of source data and outlier data. The objective is then to design a sequence of predictive source encoders that attain satisfactory mean difference-squared distortion both in the presence and in the absence of outlier data, subject to an output entropy constraint. Compared with the sequence of predictive encoders that is optimal at the Gaussian source, the tradeoff is increased mean difference-squared distortion and differential output entropy at the nominal Gaussian source, in exchange for good mean distortion performance in the presence of outliers (for parametric source encoding studies, see [1]).


II.  PRELIMINARIES 


Let $[\mu_0, X, R]$ be a discrete-time, stationary, zero-mean real source, where $R$ denotes the real line, $X$ is the name of the source, and $\mu_0$ is its measure. Let $X_i$, $i = 1, 2, \ldots$, denote the random variables generated by the source, let $x_i$, $i = 1, 2, \ldots$, denote realizations of those variables, and let $X_i^j \triangleq [X_i, \ldots, X_j]^T$ and $x_i^j \triangleq [x_i, \ldots, x_j]^T$, for $j > i$. Let $R^n$ denote the $n$-fold product of the real line. Let the measure $\mu_0$ be known, and let us then call $[\mu_0, X, R]$ the nominal source.

We now consider the outlier model. If $[\mu, Y, R]$ denotes the observation process, if $Y_i$ denotes the i-th random variable generated by this process with $y_i$ denoting its realization, and if $Y_i^j$ and $y_i^j$ denote vectors as in the above paragraph, then we have:

$$Y_i = (1 - V_i) X_i + V_i Z_i, \quad i = 1, 2, \ldots \qquad (1)$$

where $X_i$ is the i-th random variable generated by the nominal source, where $\{Z_i\}$ is a sequence of random variables whose measure is unknown, and where the variables $\{V_i\}$ are i.i.d. and binary, with:

$$P(V_i = 0) = 1 - \varepsilon, \quad P(V_i = 1) = \varepsilon \qquad (2)$$

for some $\varepsilon$ such that $0 < \varepsilon < 1$. The sequence $\{V_i\}$ determines the contamination law, and the sequence $\{Z_i\}$ corresponds to the contaminating process, which is not necessarily stationary. If $\varepsilon = 0$, then the observation process is identical to the nominal source $[\mu_0, X, R]$.
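The outlier model in (1) and (2) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the nominal source is taken to be a unit-variance first-order autoregressive Gaussian process with coefficient `rho`, and the contaminating data are drawn from a wide Gaussian with scale `outlier_scale`. Neither choice comes from the text, where the law of the contaminating process is unknown, and the function name is hypothetical.

```python
import random


def simulate_outlier_model(n, eps, rho=0.9, outlier_scale=10.0, seed=0):
    """Return (x, y): nominal data x_i and contaminated observations y_i.

    Assumptions (not from the text): AR(1) Gaussian nominal source with
    unit variance, and Gaussian contaminating data of large scale.
    """
    rng = random.Random(seed)
    x, y = [], []
    prev = rng.gauss(0.0, 1.0)  # stationary start: unit-variance Gaussian
    for _ in range(n):
        # AR(1) step that preserves unit variance.
        prev = rho * prev + (1.0 - rho * rho) ** 0.5 * rng.gauss(0.0, 1.0)
        v = 1 if rng.random() < eps else 0       # V_i, as in (2)
        z = outlier_scale * rng.gauss(0.0, 1.0)  # Z_i, hypothetical law
        x.append(prev)
        y.append((1 - v) * prev + v * z)         # Y_i, as in (1)
    return x, y


if __name__ == "__main__":
    x, y = simulate_outlier_model(10, eps=0.2)
    for xi, yi in zip(x, y):
        print(f"{xi:8.3f} {yi:8.3f} {'outlier' if xi != yi else ''}")
```

With `eps=0.0` the observations coincide with the nominal data, which matches the remark that the observation process then reduces to the nominal source.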

We will assume that the nominal source and the sequence $\{Z_i\}$ are both absolutely continuous. We then denote by $f_{\mu_0}^m(y_1^m)$ the m-dimensional density function induced by the nominal source at the vector point $y_1^m$. We denote by $f_\mu^m(y_1^m)$ the m-dimensional density function of the random vector $Y_1^m$ at the vector point $y_1^m$, where $Y_i$ is as in (1) and $V_i$ is as in (2). Let us define the following class of m-dimensional density functions:

$$F_\varepsilon^m = \{ f^m : f^m = (1-\varepsilon)^m f_{\mu_0}^m + [1 - (1-\varepsilon)^m] h^m, \text{ where } h^m \text{ is any m-dimensional density function} \} \qquad (3)$$

It can be easily seen then that $f_\mu^m \in F_\varepsilon^m$. That is, $F_\varepsilon^m$ is an enlargement of the class of m-dimensional densities that are generated by the outlier model in (1) and (2). An alternative form of the class $F_\varepsilon^m$ is as follows:

$$F_\varepsilon^m = \Big\{ f^m : f^m(y_1^m) - (1-\delta) f_{\mu_0}^m(y_1^m) \ge 0 \;\; \forall y_1^m \in R^m, \;\; \int_{R^m} f^m(y_1^m)\, dy_1^m = 1 \Big\} \qquad (4)$$

where

$$\delta \triangleq 1 - (1-\varepsilon)^m ; \quad 0 < \delta < 1 \qquad (5)$$
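The quantity in (5) is the probability that at least one of m observed data is an outlier under the i.i.d. contamination law (2), so it grows quickly with the block length m. A minimal sketch follows; the function name is hypothetical.

```python
def delta_from_epsilon(eps, m):
    """Return delta = 1 - (1 - eps)**m, the block contamination level in (5)."""
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 1.0 - (1.0 - eps) ** m


if __name__ == "__main__":
    # Even a small per-datum outlier frequency contaminates long blocks heavily.
    for m in (1, 5, 20, 100):
        print(m, delta_from_epsilon(0.05, m))
```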

Let $C_\varepsilon$ denote the class of observation processes generated by the outlier model in (1) and (2), and let us signify a process $[\mu, Y, R]$ by its measure $\mu$. Then, $\mu \in C_\varepsilon$ means that the process $[\mu, Y, R]$ is contained in class $C_\varepsilon$, and clearly $\mu_0 \in C_\varepsilon$, where $\mu_0$ is the nominal source $[\mu_0, X, R]$.

We consider predictive source coding with distortion for the nominal source $\mu_0$, when the observation process $\mu$ belongs to the class $C_\varepsilon$. In particular, for every given infinite observation sequence $y_1^\infty$, we wish to design a sequence $\{v_{m, y_1^m}\}_{m \ge 1}$ of generally stochastic operations, such that $v_{m, y_1^m}$ maps the observation vector $y_1^m$ onto a representation of the datum $x_{m+1}$ of the nominal source $\mu_0$. Let us denote by $\{v_m\}_{m \ge 1}$ the sequence of the above operations when the infinite observation sequence $y_1^\infty$ varies in $R^\infty$. Let us denote by $\mu_{\{v_m\}}$ the process induced by $\{v_m\}_{m \ge 1}$ when the observation sequences are generated by the process $\mu$, where $\mu \in C_\varepsilon$. Then, we are looking for sequences $\{v_m\}_{m \ge 1}$ which satisfy the following properties:

(a) For every $\mu$ in $C_\varepsilon$, the entropy $H(\mu_{\{v_m\}})$ of the process $\mu_{\{v_m\}}$ is bounded from above by a given finite number.

(b) There exists some constant $D < E_{\mu_0}\{X^2\}$, such that for every $\mu$ in $C_\varepsilon$, the difference-squared mean distortion induced by the sequence $\{v_m\}_{m \ge 1}$ is bounded from above by $D$. That is, if for given $\mu \in C_\varepsilon$, $Z_{k+1}$ denotes the (k+1)-th random element from the process $\mu_{\{v_m\}}$, then,

$$E_{\mu_0, \mu_{\{v_m\}}}\{(X_{k+1} - Z_{k+1})^2\} \le D ; \quad \forall k, \; \forall \mu \in C_\varepsilon \qquad (6)$$

where $X_{k+1}$ is generated by the nominal source $\mu_0$.

(c) The sequence $\{v_m\}_{m \ge 1}$ induces entropy and difference-squared mean distortion continuities at the nominal source $\mu_0$. That is, given $\eta > 0$, there exists $\gamma > 0$, such that if $\mu$ is a process $\gamma$-close to $\mu_0$ in an appropriate measure, then

$$|H(\mu_{0,\{v_m\}}) - H(\mu_{\{v_m\}})| < \eta \qquad (7)$$

$$|E_{\mu_0, \mu_{\{v_m\}}}\{(X_{k+1} - Z_{k+1})^2\} - E_{\mu_0, \mu_{0,\{v_m\}}}\{(X_{k+1} - W_{k+1})^2\}| < \eta ; \quad \forall k \qquad (8)$$

where in (8), $X_{k+1}$ is generated by $\mu_0$, $Z_{k+1}$ is generated by $\mu_{\{v_m\}}$, and $W_{k+1}$ is generated by $\mu_{0,\{v_m\}}$.

Property (c) corresponds to qualitative robustness (see [2], [3], [4], [5]), where the appropriate measure of closeness between the processes $\mu_0$ and $\mu$ is the Prohorov distance with an empirical Prohorov metric (see [4], [5]). If property (c) is satisfied, then the sequence $\{v_m\}_{m \ge 1}$ is called qualitatively robust at $\mu_0$. From the results in [4] and [6], we conclude that $\{v_m\}_{m \ge 1}$ is qualitatively robust at $\mu_0$ within the class of stationary processes $\mu$, if it satisfies the following sufficient continuity conditions, where $\Pi_{\gamma_1}$ denotes Prohorov distance with metric $\gamma_1(x, y) \triangleq |x - y|$, and where $\gamma_m(x_1^m, y_1^m) \triangleq m^{-1} \sum_{i=1}^{m} |x_i - y_i|$.

(A) Pointwise continuity. That is, given finite $m$, given $\eta > 0$, given $x_1^m$, there exists $\delta > 0$, such that $y_1^m : \gamma_m(x_1^m, y_1^m) < \delta$ implies $\Pi_{\gamma_1}(v_{m, x_1^m}, v_{m, y_1^m}) < \eta$.

(B) Asymptotic continuity at $\mu_0$. That is, given $\zeta > 0$, $\eta > 0$, there exist integers $n_0$ and $l$, some $\delta > 0$, and for each $n > n_0$ some set $A^n \subseteq R^n$ with $\mu_0(A^n) > 1 - \eta$, such that for each $x^n \in A^n$ and $y^n$ such that $\inf\{\alpha : \#[i : \gamma_l(x_i^{i+l-1}, y_i^{i+l-1}) > \alpha] \le n\alpha\} < \delta$, it is implied that $\Pi_{\gamma_1}(v_{n, x^n}, v_{n, y^n}) < \zeta$.

We point out that if for each given $x^n$ and each $n$, the operation $v_{n, x^n}$ is deterministic, then the Prohorov distance $\Pi_{\gamma_1}(v_{n, x^n}, v_{n, y^n})$ reduces to $|v_{n, x^n} - v_{n, y^n}|$.


From  now  on,  we  will  assume  that  the  nominal  source  is  Gaussian,  zero  mean,  and  stationary,  with 
given  spectral  density.  In  section  III,  we  will  outline  the  parametric  version  of  our  approach,  when  the 
observation  process  is  known  and  predictive  source  encoding  is  sought.  In  section  IV,  we  will  design 
predictive  encoding  operations  for  finite  dimensionalities  of  the  observation  sequences.  In  the  same 
section,  we  will  also  study  the  performance  of  those  operations,  both  at  the  nominal  source  and  in  the 
presence  of  contaminating  processes.  In  section  V,  we  will  consider  extensions  of  the  operations  found  in 
section  IV,  for  asymptotically  long  observation  sequences.  In  the  same  section,  we  will  also  study 
performance  issues  of  those  extensions.  In  section  VI,  we  draw  some  conclusions. 


III.  THE  PARAMETRIC  APPROACH 


In this section, we consider the case where the nominal and the observation processes are both known and mutually dependent, and predictive encoding is sought for entropy reduction. We will denote the nominal and the observation processes by $\mu_0$ and $\mu$, respectively, and we will assume that they are absolutely continuous. We will then denote by $f_\mu^m(y_1^m)$ the m-dimensional density function of the observation process at the vector point $y_1^m$. We will denote by $f_{\mu_0,\mu}(x \mid y_1^m)$ the conditional density at the point $x$ of the datum $X_{m+1}$ from the nominal process $\mu_0$, given the observation vector $y_1^m$ from the observation process $\mu$. We will also adopt the difference-squared distortion criterion.

Given the above, let us initially assume that no entropy reduction is sought. Then, as is well known, the mappings in the sequence $\{v_m\}_{m \ge 1}$ that minimize mean distortion are deterministic and given by conditional expectations. That is, given $m$ and $y_1^m$, we have

$$v_{m, y_1^m} = E\{X_{m+1} \mid y_1^m\} = \int_R x\, f_{\mu_0,\mu}(x \mid y_1^m)\, dx \triangleq m_{\mu_0,\mu}(y_1^m) \qquad (9)$$

and for $Z_{m+1}$ denoting the (m+1)-th element from the process $\mu_{\{v_m\}}$, the mean distortion induced by the operations in (9) is:

$$e_m(\mu_0, \mu) \triangleq E_{\mu_0, \mu_{\{v_m\}}}\{(X_{m+1} - Z_{m+1})^2\} = E_{\mu_0}\{X_{m+1}^2\} - \int_{R^m} f_\mu^m(y_1^m)\, m_{\mu_0,\mu}^2(y_1^m)\, dy_1^m \qquad (10)$$

Let us now assume that an upper bound, log M, on the entropy of the process $\mu_{\{v_m\}}$ is given. Then, we design a sequence $\{v_m\}_{m \ge 1}$ of stochastic mappings, as follows:

Step  1 

We select a set $\{A_i, 1 \le i \le M\}$ of intervals on the real line with $A_i \cap A_j = \emptyset \;\; \forall i \ne j$, $\bigcup_{1 \le i \le M} A_i = R$, and $\int_{A_i} f_{\mu_0}(x)\, dx = M^{-1}$, where $f_{\mu_0}(x)$ is the one-dimensional density of the process $\mu_0$ at the point $x$.
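For a Gaussian one-dimensional density, the intervals of Step 1 are delimited by the quantiles at levels i/M. The sketch below assumes a zero-mean Gaussian marginal with standard deviation `sigma`; the function name is hypothetical.

```python
from statistics import NormalDist


def equiprobable_intervals(M, sigma=1.0):
    """Return M disjoint intervals (a_i, b_i) covering the real line, each of
    probability 1/M under a zero-mean Gaussian with standard deviation sigma."""
    if M < 1:
        raise ValueError("M must be a positive integer")
    nd = NormalDist(0.0, sigma)
    cuts = [nd.inv_cdf(i / M) for i in range(1, M)]    # interior quantiles
    edges = [float("-inf")] + cuts + [float("inf")]
    return list(zip(edges[:-1], edges[1:]))


if __name__ == "__main__":
    for a, b in equiprobable_intervals(4):
        print(f"({a:.4f}, {b:.4f})")
```

The two outermost intervals are unbounded, in line with the form $A_1 = (-\infty, -a)$ and $A_M = (a, \infty)$ used later in the paper.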
Step 2

Using the set $\{A_i, 1 \le i \le M\}$ of Step 1, we design the sequence $\{v_m\}_{m \ge 1}$ of stochastic mappings so that, given $m$ and $y_1^m$, the mapping $v_{m, y_1^m}$ is a stochastic channel, mapping the sequence $y_1^m$ onto a set $\{v_i, 1 \le i \le M\}$ of scalar real values; it maps $y_1^m$ onto $v_i$ with probability:

$$p_{i,\mu_0,\mu}(y_1^m) \triangleq \int_{A_i} f_{\mu_0,\mu}(x \mid y_1^m)\, dx \qquad (11)$$

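The set $\{v_i ; 1 \le i \le M\}$ is selected to minimize the mean difference-squared distortion. That is,

$$D_{m,M,\mu_0,\mu}(\{v_i\}) \triangleq \int_{R^m} dy_1^m\, f_\mu^m(y_1^m) \sum_{i=1}^{M} p_{i,\mu_0,\mu}(y_1^m) \int_R (x - v_i)^2 f_{\mu_0,\mu}(x \mid y_1^m)\, dx = \inf_{\{a_i\}} D_{m,M,\mu_0,\mu}(\{a_i\}) \qquad (12)$$

Then, it is easily found that,

$$v_i = \Big[ \int_{R^m} dy_1^m\, f_\mu^m(y_1^m)\, p_{i,\mu_0,\mu}(y_1^m) \Big]^{-1} \int_{R^m} dy_1^m\, f_\mu^m(y_1^m)\, p_{i,\mu_0,\mu}(y_1^m)\, m_{\mu_0,\mu}(y_1^m) \qquad (13)$$

$$D_{m,M,\mu_0,\mu}(\{v_i\}) = E_{\mu_0}\{X_{m+1}^2\} - \sum_{i=1}^{M} \Big[ \int_{R^m} dy_1^m\, f_\mu^m(y_1^m)\, p_{i,\mu_0,\mu}(y_1^m) \Big]^{-1} \Big[ \int_{R^m} dy_1^m\, f_\mu^m(y_1^m)\, p_{i,\mu_0,\mu}(y_1^m)\, m_{\mu_0,\mu}(y_1^m) \Big]^2 \ge e_m(\mu_0, \mu) ; \quad \forall m \qquad (14)$$

where $e_m(\mu_0, \mu)$ is as in (10) and where $m_{\mu_0,\mu}(y_1^m)$ is the conditional expectation in (9). Due to (14), we conclude that the stochastic mappings in Step 2 induce higher mean difference-squared distortion than that induced by the conditional expectations in (9), for the gain of reduced output entropy. As the number M increases to asymptotically large values, the mean distortion $D_{m,M,\mu_0,\mu}(\{v_i\})$ approaches $e_m(\mu_0, \mu)$, and the output entropy increases to the entropy of the nominal process.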
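A scalar illustration of Steps 1 and 2 can be computed numerically. The sketch below is not the authors' procedure. It assumes m = 1, an observation process equal to the nominal source, unit variance, and one-step correlation `r` with |r| < 1, so that the conditional law of the next datum given y is Gaussian with mean r*y and variance 1 - r*r. It evaluates each value as the ratio of the integral of f(y) p_i(y) m(y) to the integral of f(y) p_i(y), and the distortion as E{X^2} minus the sum over i of (integral of f p_i m)^2 / (integral of f p_i), using the trapezoid rule on a truncated grid. All names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # zero-mean, unit-variance Gaussian


def cell_edges(M):
    """Edges of M equiprobable intervals under the unit Gaussian (Step 1)."""
    cuts = [STD.inv_cdf(i / M) for i in range(1, M)]
    return [float("-inf")] + cuts + [float("inf")]


def channel_probs(y, r, edges):
    """Mapping probabilities p_i(y): conditional mass of each interval A_i."""
    cond = NormalDist(r * y, sqrt(1.0 - r * r))
    cdf = [cond.cdf(e) for e in edges]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]


def values_and_distortion(M, r, half_width=8.0, steps=4000):
    """Return the mapping values v_i and the induced mean distortion."""
    edges = cell_edges(M)
    h = 2.0 * half_width / steps
    num = [0.0] * M  # integral of f(y) p_i(y) m(y) dy, with m(y) = r * y
    den = [0.0] * M  # integral of f(y) p_i(y) dy
    for k in range(steps + 1):
        y = -half_width + k * h
        w = h * STD.pdf(y) * (0.5 if k in (0, steps) else 1.0)  # trapezoid
        for i, p in enumerate(channel_probs(y, r, edges)):
            num[i] += w * p * r * y
            den[i] += w * p
    v = [n / d for n, d in zip(num, den)]
    distortion = 1.0 - sum(n * n / d for n, d in zip(num, den))  # E{X^2} = 1
    return v, distortion


if __name__ == "__main__":
    r = 0.8
    for M in (2, 4, 16):
        v, D = values_and_distortion(M, r)
        print(M, round(D, 4), "predictor error:", round(1.0 - r * r, 4))
```

Under these assumptions the computed distortion stays above the error 1 - r*r of the conditional-expectation predictor, which is the inequality stated in (14).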



Let the nominal process $\mu_0$ be zero mean and stationary Gaussian with variance per datum $\sigma_0^2$, and let the observation process $\mu$ be $\mu_0$. Let $\rho_m^2$ then denote the mean-squared error induced by the optimal at $\mu_0$ mean-squared one-step predictor, when the size of the observation vector is $m$. Let the interval $A_i$ in (11) be $(a_i, b_i)$, where $b_i > a_i$. Then, we easily find that the expressions in (13) and (14) take the following form, where $\varphi(x)$ and $\Phi(x)$ denote respectively the density function and the distribution of the zero mean and unit variance Gaussian random variable, at the point $x$:

IV.  FINITE  DIMENSIONALITY  OBSERVATION  SEQUENCES

In this section, we consider the outlier model, as exhibited by the observation process in (1) and (2), and we assume that the nominal process is stationary zero mean Gaussian. We then wish to design predictive encoding operations $v_m$, for $1 \le m \le l$, where $l$ is some given finite integer. We want the designed operations to satisfy properties (a), (b), and (c) in section II. For given finite $l$, we adopt a saddle-point game theoretic approach, based on the parametric scheme in section III. We first assume that the processes in the class $C_\varepsilon$ in (1) and (2) are all absolutely continuous, and we denote by $f_0(y_1^m)$ the m-dimensional density function of the nominal Gaussian process $\mu_0$ at the vector point $y_1^m$. Then, given $l$, we consider an enlargement, $F_\delta$, of the class of l-dimensional densities generated by the model in (1) and (2), as that in (4). In particular, we consider l-dimensional densities, $f$, of the observation process, such that $f \in F_\delta$, where:

$$F_\delta = \Big\{ f : f(y_1^l) - (1-\delta) f_0(y_1^l) \ge 0 \;\; \forall y_1^l \in R^l, \;\; \int_{R^l} f(y_1^l)\, dy_1^l = 1 \Big\} \qquad (17)$$

$$\delta \triangleq 1 - (1-\varepsilon)^l ; \quad 0 < \delta < 1 \qquad (18)$$

Let an upper bound, log M, on the output entropy be given. Then, we wish to design predictive encoding operations which satisfy this bound for every process in class $F_\delta$, and which induce mean difference-squared distortion that is upper bounded by a given bound, for every $f \in F_\delta$. Our approach evolves from the parametric scheme in section III, and goes as follows:

Step  1 

Select a set $\{A_i, 1 \le i \le M\}$ of intervals on the real line with $A_i \cap A_j = \emptyset \;\; \forall i \ne j$, $\bigcup_{1 \le i \le M} A_i = R$, and $\int_{A_i} f_0(x)\, dx = M^{-1}$, where $f_0(x)$ is the one-dimensional density of the Gaussian nominal process $\mu_0$ at the point $x$.

Step 2

Using the set $\{A_i, 1 \le i \le M\}$ in Step 1, and given a process $\mu$ whose density function belongs to the class $F_\delta$, we form the set $\{p_{i,\mu}, 1 \le i \le M\}$ of probabilities as follows.

$$\text{Given } y_1^m \text{ in } R^m : \quad p_{i,\mu}(y_1^m) \triangleq \int_{A_i} f_{\mu_0,\mu}(x \mid y_1^m)\, dx, \quad 1 \le i \le M \qquad (19)$$

Let $N_M$ denote the set of sets $\{a_i ; 1 \le i \le M\}$ of M real numbers. We then consider the following class, $D$, of mappings $v_l = v_l(\mu, \{a_i\})$, that is generated by varying $\mu$ in $F_\delta$ and $\{a_i\}$ in $N_M$: Given $\mu$ in $F_\delta$ and $\{a_i\}$ in $N_M$, given observation sequence $y_1^m$, $v_{l, y_1^m}$ maps the sequence $y_1^m$ onto the value $a_i$ with probability $p_{i,\mu}(y_1^m)$, as in (19). Given $\{a_i\}$ in $N_M$, given $\mu_1$ and $\mu_2$ in $F_\delta$, let $D_l(\mu_1, \mu_2, \{a_i\})$ denote the mean difference-squared distortion induced by the operation $v_l(\mu_2, \{a_i\})$ in $D$, at the observation process $\mu_1$. Then,


$$D_l(\mu_1, \mu_2, \{a_i\}) = \int_{R^m} dy_1^m\, f_{\mu_1}(y_1^m) \sum_{i=1}^{M} p_{i,\mu_2}(y_1^m) \int_R (x - a_i)^2 f_{\mu_0,\mu_1}(x \mid y_1^m)\, dx \qquad (20)$$

We are then searching for a triple $(\mu_1^*, \mu_2^*, \{v_i\})$, such that $\mu_1^* \in F_\delta$, $\mu_2^* \in F_\delta$, $\{v_i\} \in N_M$, and:

$$\forall \mu_1 \in F_\delta ; \quad D_l(\mu_1, \mu_2^*, \{v_i\}) \le D_l(\mu_1^*, \mu_2^*, \{v_i\}) \le D_l(\mu_1^*, \mu_2^*, \{a_i\}) ; \quad \forall \{a_i\} \in N_M \qquad (21)$$

Then, we select the $v_l^* = v_l(\mu_2^*, \{v_i\})$ encoding scheme for the class $F_\delta$.

Remark  If an encoding scheme $v_l^* = v_l(\mu_2^*, \{v_i\})$ in $D$ exists, such that it satisfies (21), then it is guaranteed that the maximum mean difference-squared distortion that it induces in $F_\delta$ is $\sup_{\mu \in F_\delta} D_l(\mu, \mu_2^*, \{v_i\})$, subject to the existence of the latter supremum. By construction, the mapping $v_l^*$ also attains maximum entropy in $F_\delta$ that is bounded from above by log M.

Let $f_0(x \mid y_1^m)$ denote the conditional density of the Gaussian nominal process for the datum $X_{m+1}$ at the point $x$, given the past sequence $y_1^m$ from the same process. Let $f_0(y_1^m)$ denote the m-dimensional density of the Gaussian nominal process at the vector point $y_1^m$, and let $Q_m$ be the m-dimensional autocovariance matrix of the process. Let us also then define:

$$m_0(y_1^m) \triangleq \int_R x\, f_0(x \mid y_1^m)\, dx, \qquad p_{0i}(y_1^m) \triangleq \int_{A_i} f_0(x \mid y_1^m)\, dx \qquad (22)$$


We  then  express  a  theorem  whose  proof  is  in  the  Appendix. 


Theorem  1 


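Given the class $F_\delta$ in (17), and for every $\delta : 0 < \delta < 1$, the game in (21) has a solution $(\mu_1^*, \mu_2^*, \{v_i\})$. If $f_j^*(y_1^l)$, $j = 1, 2$, denotes the l-dimensional density function of the process $\mu_j^*$, $j = 1, 2$, at the vector point $y_1^l$, then this solution is as follows:

$$f^*(y_1^l) \triangleq f_1^*(y_1^l) = f_2^*(y_1^l) = (1-\delta) f_0(y_1^l) \max\big(1, \lambda_l^{-1} \{(y_1^l)^T Q_l^{-1} y_1^l\}^{1/2}\big) \qquad (23)$$

where $\lambda_l : \int_{R^l} f^*(y_1^l)\, dy_1^l = 1$

and for,

$$q_i(y_1^l) \triangleq M^{-1} \big[ 1 - \min\big(1, \lambda_l \{(y_1^l)^T Q_l^{-1} y_1^l\}^{-1/2}\big) \big] + p_{0i}(y_1^l) \min\big(1, \lambda_l \{(y_1^l)^T Q_l^{-1} y_1^l\}^{-1/2}\big) \qquad (24)$$

$$v_i = M(1-\delta) \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q_i(y_1^l) \qquad (25)$$

Then,

$$\forall \mu \in F_\delta ; \quad D_l(\mu, \mu_2^*, \{v_i\}) \le D_l(\mu_2^*, \mu_2^*, \{v_i\}) \triangleq D_{l,\max} = E_{\mu_0}\{X^2\} - (1-\delta)^2 M \sum_{i=1}^{M} \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q_i(y_1^l) \Big]^2 \qquad (26)$$

The encoding scheme $v_l^*$ is as follows:

Given an observation sequence $y_1^l$, $v_l^*$ maps it onto $v_i$ with probability $q_i(y_1^l)$.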
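For l = 1 and unit variance, the quantities in Theorem 1 can be evaluated directly. The sketch below is an illustration under assumptions not made in the text: the quadratic form reduces to |y|, the normalization of the least favorable density becomes (2*Phi(lam) - 1) + 2*phi(lam)/lam = 1/(1 - delta), which is solved by bisection, and the nominal conditional law given y is taken as Gaussian with mean r*y and variance 1 - r*r. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # zero-mean, unit-variance Gaussian


def cell_edges(M):
    """Edges of M equiprobable intervals under the unit Gaussian."""
    cuts = [STD.inv_cdf(i / M) for i in range(1, M)]
    return [float("-inf")] + cuts + [float("inf")]


def normalizer(lam):
    """Closed form of the integral of phi(y) * max(1, |y| / lam) over R."""
    return (2.0 * STD.cdf(lam) - 1.0) + 2.0 * STD.pdf(lam) / lam


def solve_lambda(delta, lo=1e-9, hi=50.0, iters=200):
    """Find lam such that (1 - delta) * normalizer(lam) = 1 (bisection)."""
    target = 1.0 / (1.0 - delta)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if normalizer(mid) > target:
            lo = mid  # normalizer decreases in lam, so lam must grow
        else:
            hi = mid
    return 0.5 * (lo + hi)


def robust_probs(y, r, lam, edges):
    """Mixture of uniform and nominal conditional probabilities, as in (24)."""
    g = 1.0 if y == 0.0 else min(1.0, lam / abs(y))
    cond = NormalDist(r * y, sqrt(1.0 - r * r))
    cdf = [cond.cdf(e) for e in edges]
    p0 = [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]
    M = len(p0)
    return [(1.0 - g) / M + g * p for p in p0]


if __name__ == "__main__":
    lam = solve_lambda(0.1)
    edges = cell_edges(4)
    for y in (0.5, 3.0, 30.0):
        print(y, [round(q, 4) for q in robust_probs(y, 0.8, lam, edges)])
```

As the observation amplitude grows, the printed probabilities move toward the uniform value 1/M, which is the outlier-discounting behavior the theorem describes.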




Given $l$ and $M$, an encoding scheme $v_l$ consists of a set $\{a_i, 1 \le i \le M\}$ of values, and for every observation sequence $y_1^l$ a set $\{p_i(y_1^l), 1 \le i \le M\}$ of probabilities, such that $y_1^l$ is mapped onto $a_i$ with probability $p_i(y_1^l)$. Given $l$, given some encoding scheme $v_l$, given an absolutely continuous observation process with arbitrary dimensionality densities, $f$, let $D_l(f, v_l)$ denote the mean difference-squared distortion induced when $v_l$ is deployed, $f$ is the density of the observation process, and a datum from the nominal Gaussian source is predictively encoded. Let $v_l^o$ denote the optimal at the Gaussian observation process encoding scheme. That is, given an observation sequence $y_1^l$, $v_l^o$ maps $y_1^l$ onto $u_i$ with probability $p_{0i}(y_1^l)$, where, given the set $\{A_i, 1 \le i \le M\}$, $p_{0i}(y_1^l)$ is as in (22), and where for $m_0(y_1^l)$ as in (22):

$$u_i = M \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, p_{0i}(y_1^l) \qquad (27)$$

Let the common set $\{A_i, 1 \le i \le M\}$ be used by both the scheme $v_l^o$ and the scheme $v_l^*$ in Theorem 1, and let this set be such that $\int_{A_i} f_0(x)\, dx = M^{-1} ; \; \forall i$. Let $f_0$ denote the arbitrary dimensionality density of the nominal Gaussian source, and let $m_0(y_1^l)$ and $p_{0i}(y_1^l)$ be as in (22) and $q_i(y_1^l)$ be as in (24). Then, by substitution, we easily obtain:

$$D_l(f_0, v_l^o) = E_{\mu_0}\{X^2\} - M \sum_{i=1}^{M} \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, p_{0i}(y_1^l) \Big]^2 \qquad (28)$$

$$D_l(f_0, v_l^*) = E_{\mu_0}\{X^2\} - (1-\delta) M \sum_{i=1}^{M} \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q_i(y_1^l) \Big]^2 \Big[ 2 - (1-\delta) M \int_{R^l} dy_1^l\, f_0(y_1^l)\, q_i(y_1^l) \Big] \qquad (29)$$

Let $\mathbf{1}^l$ denote the l-dimensional vector whose elements are all equal to one. Let $z$ denote some scalar real number, and let us then consider a density $f$, such that $f(y_1^l) = (1-\zeta) f_0(y_1^l) + \zeta\, \delta(z\mathbf{1}^l)$, where $\zeta$ is given and such that $0 < \zeta < 1$, where $f_0$ is the density of the Gaussian nominal source, and where $\delta(\cdot)$ denotes the delta function. Given $l$, given an encoding scheme $v_l$, let $D_l(f_0, \zeta, z, v_l)$ denote the mean difference-squared distortion induced by $v_l$, when the observation density is such that $f(y_1^l) = (1-\zeta) f_0(y_1^l) + \zeta\, \delta(z\mathbf{1}^l)$ and a datum from the Gaussian nominal source is predictively encoded. Then, for $D_l(f_0, v_l^o)$ as in (28) and for $D_l(f_0, v_l^*)$ as in (29), we obtain by substitution:

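$$D_l(f_0, \zeta, z, v_l^o) - D_l(f_0, v_l^o) \triangleq V_l(f_0, \zeta, z, v_l^o) = \zeta M \sum_{i=1}^{M} \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, p_{0i}(y_1^l) \Big]^2 \big[ 1 + M p_{0i}(z\mathbf{1}^l) \big] \qquad (30)$$

$$D_l(f_0, \zeta, z, v_l^*) - D_l(f_0, v_l^*) \triangleq V_l(f_0, \zeta, z, v_l^*) = \zeta (1-\delta) M \sum_{i=1}^{M} \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q_i(y_1^l) \Big]^2 \Big[ 2 - (1-\delta) M \int_{R^l} dy_1^l\, f_0(y_1^l)\, q_i(y_1^l) + M(1-\delta)\, q_i(z\mathbf{1}^l) \Big] \qquad (31)$$

The functions in (30) and (31) represent changes in mean difference-squared distortion, when the observation process shifts from the one corresponding to the nominal source to a mixed process, which with probability $(1-\zeta)$ is the nominal source and which generates deterministic z-amplitude data with probability $\zeta$. The rates of those changes at $\zeta = 0$ are the Influence Functions, $I_l(f_0, z, v_l^o)$ and $I_l(f_0, z, v_l^*)$, of respectively the encoding schemes $v_l^o$ and $v_l^*$, at the nominal source $\mu_0$ and the amplitude value $z$. That is,

$$I_l(f_0, z, v_l^o) \triangleq \frac{\partial V_l(f_0, \zeta, z, v_l^o)}{\partial \zeta} \Big|_{\zeta = 0} = M \sum_{i=1}^{M} \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, p_{0i}(y_1^l) \Big]^2 \big[ 1 + M p_{0i}(z\mathbf{1}^l) \big] \qquad (32)$$

$$I_l(f_0, z, v_l^*) \triangleq \frac{\partial V_l(f_0, \zeta, z, v_l^*)}{\partial \zeta} \Big|_{\zeta = 0} = (1-\delta) M \sum_{i=1}^{M} \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q_i(y_1^l) \Big]^2 \Big[ 2 - (1-\delta) M \int_{R^l} dy_1^l\, f_0(y_1^l)\, q_i(y_1^l) + M(1-\delta)\, q_i(z\mathbf{1}^l) \Big] \qquad (33)$$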

Given $l$, given an encoding scheme $v_l$, given the nominal density $f_0$, given $z$ and $\zeta$, let us consider the mean difference-squared distortion $D_l(f_0, \zeta, z, v_l)$. Let us allow the value $|z|$ to go to infinity, and let us then find the maximum value $\zeta$ for which $\lim_{|z| \to \infty} D_l(f_0, \zeta, z, v_l) < E_{\mu_0}\{X^2\}$. This latter value is the Breakdown Point of the encoding scheme $v_l$ at $\mu_0$. It represents the highest frequency of extreme amplitude ($\pm\infty$) deterministic outlier values that the encoding scheme can tolerate before it becomes useless; that is, before the observation sequences provide no information about the source data. We now express a lemma, whose proof is in the Appendix.

Lemma  1 

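Given $M$, consider a set $\{A_i, 1 \le i \le M\}$ of intervals on the real line with $A_i \cap A_j = \emptyset \;\; \forall i \ne j$, $\bigcup_{1 \le i \le M} A_i = R$, and $\int_{A_i} f_0(x)\, dx = M^{-1}$, where $f_0(x)$ is the one-dimensional density of the Gaussian nominal source. Let in addition $A_1 = (-\infty, -a)$ and $A_M = (a, \infty)$ for $a > 0$. Let $v_l^o$ be the optimal at the Gaussian process encoding scheme, and let $v_l^*$ be as in Theorem 1. Then, given $l$, the breakdown points $\zeta_l^o$ and $\zeta_l^*$ of the schemes $v_l^o$ and $v_l^*$, respectively, are given by the following expressions:

$$\zeta_l^o = \Big\{ 1 + M P_{01}^2 \Big[ \sum_{i=1}^{M} P_{0i}^2 \Big]^{-1} \Big\}^{-1} \qquad (34)$$

$$\zeta_l^* = \Big\{ 1 + (1-\delta) \Big[ \sum_{i=1}^{M} Q_{0i}^2 \Big] \Big[ \sum_{i=1}^{M} Q_{0i}^2 \Big( 2 - (1-\delta) M \int_{R^l} dy_1^l\, f_0(y_1^l)\, q_i(y_1^l) \Big) \Big]^{-1} \Big\}^{-1} \qquad (35)$$

where, for $m_0(y_1^l)$ and $p_{0i}(y_1^l)$ as in (22) and $q_i(y_1^l)$ as in (24),

$$P_{0i} \triangleq \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, p_{0i}(y_1^l) \qquad (36)$$

$$Q_{0i} \triangleq \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q_i(y_1^l) \qquad (37)$$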
Remarks  For finite dimensionalities of the observation sequence, the encoding operation $v_l^*$ clearly satisfies the pointwise continuity property (A) in section II; thus, it is qualitatively robust. As exhibited by expression (24) in Theorem 1, for $\{(y_1^l)^T Q_l^{-1} y_1^l\}^{1/2}$ relatively small, the operation $v_l^*$ maps sequences $y_1^l$ onto the set of values in (25), using the optimal at the Gaussian nominal source conditional probabilities. As $\{(y_1^l)^T Q_l^{-1} y_1^l\}^{1/2}$ increases, however, the operation $v_l^*$ uses a mixture of such probabilities and the equiprobable mapping probabilities $M^{-1}$, and asymptotically (for $(y_1^l)^T Q_l^{-1} y_1^l \to \infty$), it maps the sequences $y_1^l$ using the unconditional nominal density function $f_0(x)$. Thus, it disregards extreme observation values, offering protection against data outliers, at the expense of reduced mean difference-squared performance at the nominal source.


Asymptotic  Performance 


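Let us assume that the number of values onto which observation sequences $y_1^l$ are mapped is asymptotically large. That is, $M \to \infty$. We are then interested in the performance of the encoding schemes $v_l^o$ and $v_l^*$, for given finite $l$. From expressions (28), (29), (30), (31), (32), and (33), and taking limits, we obtain the following.

Define the scalars $A_l$ and $\rho_l$ as follows:

$$A_l : \int_{R^l} dy_1^l\, f_0(y_1^l)\, f_0(x \mid y_1^l)\, m_0(y_1^l) = A_l\, x f_0(x)$$

$$\rho_l^2 \triangleq E_{\mu_0}\{[X_{l+1} - m_0(X_1^l)]^2\} \qquad (38)$$

Then,

$$\lim_{M \to \infty} D_l(f_0, v_l^o) = (1 - A_l^2)\, E_{\mu_0}\{X^2\} \qquad (39)$$

$$\lim_{M \to \infty} [D_l(f_0, \zeta, z, v_l^o) - D_l(f_0, v_l^o)] = \zeta A_l^2 \big\{ [1 + \rho_l^2 \sigma_0^{-2}]\, E_{\mu_0}\{X^2\} + m_0^2(z\mathbf{1}^l) \big\} \qquad (40)$$

Define,

$$q(x, y_1^l) \triangleq \big[ 1 - \min\big(1, \lambda_l \{(y_1^l)^T Q_l^{-1} y_1^l\}^{-1/2}\big) \big] f_0(x) + \min\big(1, \lambda_l \{(y_1^l)^T Q_l^{-1} y_1^l\}^{-1/2}\big) f_0(x \mid y_1^l) \qquad (41)$$

Then,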


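$$\lim_{M \to \infty} D_l(f_0, v_l^*) = E_{\mu_0}\{X^2\} - 2(1-\delta) \int_R dx\, f_0^{-1}(x) \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q(x, y_1^l) \Big]^2 + (1-\delta)^2 \int_R dx\, f_0^{-2}(x) \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, q(x, y_1^l) \Big] \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q(x, y_1^l) \Big]^2 \ge (1 - A_l^2)\, E_{\mu_0}\{X^2\} \qquad (42)$$

$$\lim_{M \to \infty} \{D_l(f_0, \zeta, z, v_l^*) - D_l(f_0, v_l^*)\} = \zeta\, 2(1-\delta) \int_R dx\, f_0^{-1}(x) \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q(x, y_1^l) \Big]^2 - \zeta (1-\delta)^2 \int_R dx\, f_0^{-2}(x) \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, q(x, y_1^l) \Big] \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q(x, y_1^l) \Big]^2 + \zeta (1-\delta)^2 \int_R dx\, f_0^{-2}(x)\, q(x, z\mathbf{1}^l) \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q(x, y_1^l) \Big]^2 \qquad (43)$$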


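From the above expressions, and noting that $\lim_{|z| \to \infty} q(x, z\mathbf{1}^l) = f_0(x)$, we also find, denoting by $Q_l$ the $l \times l$ autocovariance matrix of the nominal Gaussian source:

Define,

$$c_l \triangleq \{(\mathbf{1}^l)^T Q_l^{-1} \mathbf{1}^l\}^{-1/2} \qquad (44)$$

Then,

$$\lim_{M \to \infty} I_l(f_0, z, v_l^o) = A_l^2 \big\{ [1 + \rho_l^2 \sigma_0^{-2}]\, E_{\mu_0}\{X^2\} + z^2 m_0^2(\mathbf{1}^l) \big\} \qquad (45)$$

$$\lim_{M \to \infty} I_l(f_0, z, v_l^*) = (1-\delta) \big\{ 2 + (1-\delta)[1 - \min(1, \lambda_l c_l |z|^{-1})] \big\} \int_R dx\, f_0^{-1}(x) \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q(x, y_1^l) \Big]^2 + (1-\delta)^2 \min(1, \lambda_l c_l |z|^{-1}) \int_R dx\, f_0^{-2}(x)\, f_0(x \mid z\mathbf{1}^l) \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q(x, y_1^l) \Big]^2 - (1-\delta)^2 \int_R dx\, f_0^{-2}(x) \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, q(x, y_1^l) \Big] \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q(x, y_1^l) \Big]^2 \qquad (46)$$

$$\lim_{M \to \infty} \zeta_l^o = 0 \qquad (47)$$

$$\lim_{M \to \infty} \zeta_l^* = \Big\{ 1 + (1-\delta) \frac{\int_R dx\, f_0^{-1}(x) \big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q(x, y_1^l) \big]^2}{\int_R dx \big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q(x, y_1^l) \big]^2 \big[ 2 f_0^{-1}(x) - (1-\delta) f_0^{-2}(x) \int_{R^l} dy_1^l\, f_0(y_1^l)\, q(x, y_1^l) \big]} \Big\}^{-1} \qquad (48)$$

Let us define,

$$m_l \triangleq \max_{y_1^l : (y_1^l)^T Q_l^{-1} y_1^l = 1} |m_0(y_1^l)| \qquad (49)$$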


Then,  we  can  express  the  following  lemma,  whose  proof  is  in  the  Appendix. 


Lemma  2 


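The limit influence function in (46), and the limit breakdown point in (48), that the encoding operation $v_l^*$ induces, are bounded as below:

$$\lim_{M \to \infty} I_l(f_0, z, v_l^*) \le (1-\delta)(3-\delta)\, 4 \lambda_l^2 m_l^2 - (1-\delta)^2 \int_R dx\, f_0^{-2}(x) \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, q(x, y_1^l) \Big] \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q(x, y_1^l) \Big]^2 \qquad (50)$$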


1+5  M 


<  /im  (,,*  <(l+5)4X?m2{(l+5)4\2m2  +(1-5)  |  d\f0\\)[  \dy'lf0(y\)m0{y\)q{x,y\)}2 )  1  (51) 


Thus, asymptotically ($M \to \infty$), the optimal at the nominal source encoding operation has breakdown point zero and quadratic influence function. On the other hand, the operation $v_l^*$ then has uniformly bounded influence function and strictly positive breakdown point.

Remarks  As compared to the optimal at the nominal source operation $v_l^o$, the operation $v_l^*$ is asymptotically ($M \to \infty$) superior in terms of breakdown point and influence function performances. This is at the expense of mean difference-squared distortion and differential entropy performances at the nominal Gaussian source. Indeed, as it can be easily seen, asymptotically ($M \to \infty$), the process induced by $v_l^*$ and the Gaussian measure $\mu_0$ has higher differential entropy than the process induced by $v_l^o$ and $\mu_0$. In addition, $\lim_{M \to \infty} D_l(f_0, v_l^*) \ge \lim_{M \to \infty} D_l(f_0, v_l^o)$, and from (26) we conclude:

$$\lim_{M \to \infty} D_l(f_0, v_l^*) \le E_{\mu_0}\{X^2\} - (1-\delta)^2 \int_R dx\, f_0^{-1}(x) \Big[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q(x, y_1^l) \Big]^2 \qquad (52)$$

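Given $l$, let $H_\mu(v_l)$ denote the differential entropy induced asymptotically ($M \to \infty$) by the encoding scheme $v_l$ at the observation process $\mu$. Let $H_{\mu_0,\zeta,z}(v_l)$ denote the differential entropy induced asymptotically ($M \to \infty$) by $v_l$, when the observation sequence is generated by the nominal source $\mu_0$ with probability $(1-\zeta)$, and it consists of deterministic, amplitude-z data with probability $\zeta$. Let $\rho_l$ be as in (38), and let us define,

$$\sigma_0^2 \triangleq E_{\mu_0}\{X^2\}, \qquad \alpha_l \triangleq \rho_l \sigma_0^{-1}, \qquad g(y_1^l) \triangleq \min\big(1, \lambda_l \{(y_1^l)^T Q_l^{-1} y_1^l\}^{-1/2}\big) \qquad (53)$$

Then, we can express the following lemma, whose proof is in the Appendix.

Lemma 3

Let $\mu$ be some absolutely continuous observation process. Given $l$, let $f(y_1^l)$ denote the density function of this process at the vector point $y_1^l$. For,

$$B_\mu(v_l^*) \triangleq 2^{-1} \int_{R^l} dy_1^l\, f(y_1^l) [1 - g(y_1^l)]\, g(y_1^l) \big[ -2 + \alpha_l^2 + \alpha_l^{-2} + \rho_l^{-2} (1 + \alpha_l^2)\, m_0^2(y_1^l) \big] - [\ln \alpha_l] \int_{R^l} dy_1^l\, f(y_1^l) [1 - g(y_1^l)] \qquad (54)$$

the differential entropies $H_\mu(v_l^*)$ and $H_{\mu_0,\zeta,z}(v_l^*)$ are bounded from above as follows:

$$H_\mu(v_l^*) \le 2^{-1} [1 + \ln 2\pi\rho_l^2] + B_\mu(v_l^*) \qquad (55)$$

$$H_{\mu_0,\zeta,z}(v_l^*) \le 2^{-1} [1 + \ln 2\pi\rho_l^2] + (1-\zeta) B_{\mu_0}(v_l^*) + \zeta\, 2^{-1} g(z\mathbf{1}^l) [1 - g(z\mathbf{1}^l)] \big[ -2 + \alpha_l^2 + \alpha_l^{-2} + \rho_l^{-2} (1 + \alpha_l^2)\, z^2 m_0^2(\mathbf{1}^l) \big] - \zeta [1 - g(z\mathbf{1}^l)] \ln \alpha_l \qquad (56)$$

For $|z| \to \infty$, we find a tighter bound on $H_{\mu_0,\zeta,z}(v_l^*)$, as follows:

$$\lim_{|z| \to \infty} H_{\mu_0,\zeta,z}(v_l^*) = (1-\zeta) H_{\mu_0}(v_l^*) - \zeta \int_R dx\, f_0(x) \ln f_0(x) \le (1-\zeta)\, 2^{-1} [1 + \ln 2\pi\rho_l^2] + \zeta\, 2^{-1} [1 + \ln 2\pi\sigma_0^2] + (1-\zeta) B_{\mu_0}(v_l^*) = 2^{-1} [1 + \ln 2\pi\rho_l^2] + (1-\zeta) B_{\mu_0}(v_l^*) - \zeta \ln \alpha_l \qquad (57)$$

We note that the differential entropy $H_{\mu_0}(v_l^o)$ induced asymptotically ($M \to \infty$) at the nominal source by the optimal at the nominal predictive operation $v_l^o$ is as follows:

$$H_{\mu_0}(v_l^o) = 2^{-1} [1 + \ln 2\pi\rho_l^2] \qquad (58)$$

$$H_{\mu_0,\zeta,z}(v_l^o) = 2^{-1} [1 + \ln 2\pi\rho_l^2] ; \quad \forall \zeta, z \qquad (59)$$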

We point out that when the nominal Gaussian source is k-order Markov, we select $l = k$ and deploy the predictive operation $v_k^*$ in Theorem 1, for $l = k$.


V.  ASYMPTOTICALLY  LONG  OBSERVATION  SEQUENCES 


In this section, we consider the same outlier model and the same Gaussian source as in section IV, but we include asymptotically long observation sequences. In the presence of such sequences, the precise modelling of the observation processes that evolve from the outlier model in (1) and (2) is an impossible task. On the other hand, enlargements of the class of observation processes, as those in (17), misrepresent the actual class when long observation sequences are considered. In fact, when the length $l$ of the observation sequences tends to infinity, the class $F_\delta$ in (17) represents the case where the observation process is the nominal source with probability $(1-\delta)$, and it is some other process with probability $\delta$; that is, no data mixing is then included, and the outlier model is not then a member of the class. For non Markovian Gaussian nominal source and asymptotically long observation sequences, we thus extend the predictive operations of section IV in an ad hoc, but intuitively satisfactory, fashion.

Given $l$ finite, given $k$, given the observation sequence $y_1^{kl}$, and for $Q_l$ denoting the l-dimensional autocovariance matrix of the nominal Gaussian source, let us define,

$$a_{j,l}(y_1^{kl}) \triangleq \big\{ (y_{jl+1}^{(j+1)l})^T Q_l^{-1}\, y_{jl+1}^{(j+1)l} \big\}^{1/2} ; \quad 0 \le j \le k-1 \qquad (60)$$

For $\lambda_l$ as in (23) in Theorem 1, let us also define,


(yi')^rnin 


1. 


h 


My*') 


zi'Cyrf  T  =  . z£_iytl(y  Y) 

L  L  J 


(61) 


Let us now consider the following two mapping densities, that map the observation sequence $y_1^{kl}$ onto the real line, for predictive encoding of the datum $X_{kl+1}$ from the nominal source:


VI 


q'lx.y1^  X  X  II  min!  I. 


k-i 


f ,  h  y 

k-i  1 

“ 

l  '  av/tyV), 

II 

1 -in  in 

j-nul 

_  V. 

<>  (i,.l5jlin)  > 

•l/o(x!  1-J-m)  -/„(x)l  +-/„<x) 


1. 


X, 


V! 


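(62)

$$q^*(x, y_1^{kl}) \triangleq \left[ 1 - k^{-1} \sum_{j=0}^{k-1} \min\left(1, \frac{\lambda_l}{a_{j,l}(y_1^{kl})}\right) \right] f_0(x) + k^{-1} \sum_{j=0}^{k-1} \min\left(1, \frac{\lambda_l}{a_{j,l}(y_1^{kl})}\right) f_0\left(x \mid z_1^{kl}(y_1^{kl})\right) \qquad (63)$$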
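The mixture structure of the second mapping density can be sketched in a few lines. This is an illustration only, assuming that the density is a convex combination of the unconditional Gaussian density $f_0(x)$ and the conditional Gaussian density given the truncated data, with mixing weight equal to the average of the block weights $\min(1, \lambda_l/a_{j,l})$. The names `gaussian_pdf` and `mapping_density`, and the arguments `cond_mean`, `r0_sq` and `rho_sq` (conditional mean, source variance, one-step prediction error variance), are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mapping_density(x, weights, cond_mean, r0_sq, rho_sq):
    """(1 - G) * f0(x) + G * f0(x | truncated data), with G the mean block weight.

    weights   : block weights min(1, lam / a_j), each in [0, 1]
    cond_mean : conditional mean of the predicted datum given the truncated data
    r0_sq     : variance of the zero-mean nominal source
    rho_sq    : conditional (one-step prediction error) variance
    """
    G = sum(weights) / len(weights)
    return (1.0 - G) * gaussian_pdf(x, 0.0, r0_sq) + G * gaussian_pdf(x, cond_mean, rho_sq)

# All weights equal to one recovers the conditional density; all weights
# equal to zero (extreme data in every block) recovers the unconditional one.
q_clean = mapping_density(0.3, [1.0, 1.0], cond_mean=0.5, r0_sq=2.0, rho_sq=0.5)
q_outlying = mapping_density(0.3, [0.0, 0.0], cond_mean=0.5, r0_sq=2.0, rho_sq=0.5)
```

The two limiting cases in the usage lines correspond to the two properties stated in the text: convergence to the optimal at the nominal source mapping, and disregard of extreme data values.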


The mapping density in (62) is an intuitively pleasing extension of the operation $v_l^*$ in Theorem 1,
but it is very complex, both in terms of implementation and in terms of analysis. In addition, it does not
provide a clear indication as to the mapping values when their number $M$ is finite. The mapping density
in (63) is much simpler, and it also has intuitively pleasing characteristics. For $\lambda_l \to \infty$,
it converges to the optimal at the nominal source mapping. It also disregards extreme data values, using
the unconditional density $f_0(x)$ in its mapping, when
$k^{-1} \sum_{j=0}^{k-1} \min\left(1, \lambda_l / a_{j,l}(y_1^{kl})\right) \to 0$. In addition,
$q^*(x, y_1^{kl})$ provides easy extensions of the mapping values in (25), when $M$ is finite. In
conclusion, we propose the following predictive encoding scheme for non-Markovian Gaussian nominal
sources and arbitrarily long observation sequences:

Encoding  Scheme 


Given $M$, select a set $\{A_i, 1 \le i \le M\}$ of intervals on the real line with
$A_i \cap A_j = \emptyset;\ \forall i \ne j$, $\bigcup_{1 \le i \le M} A_i = R$, and
$\int_{A_i} f_0(x)\,dx = M^{-1},\ \forall i$.

Select some finite natural number $l$, and given $\delta : 0 < \delta < 1$, find the positive constant
$\lambda_l$ as in (23). Then, given $k$, given an observation sequence $y_1^{kl}$, map $y_1^{kl}$ onto
$v_i^*$ with probability $q_i^*(y_1^{kl})$, where for $p_{0i}(y_1^m)$ as in (22), for
$z_1^{kl}(y_1^{kl})$ as in (61), and for $a_{j,l}(y_1^{kl})$ as in (60), the values
$\{v_i^*, 1 \le i \le M\}$ and the probabilities $q_i^*(y_1^{kl})$ are as follows:


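$$q_i^*(y_1^{kl}) = M^{-1} \left[ 1 - k^{-1} \sum_{j=0}^{k-1} \min\left(1, \frac{\lambda_l}{a_{j,l}(y_1^{kl})}\right) \right] + k^{-1} \sum_{j=0}^{k-1} \min\left(1, \frac{\lambda_l}{a_{j,l}(y_1^{kl})}\right) p_{0i}\left(z_1^{kl}(y_1^{kl})\right) \qquad (64)$$

$$v_i^* = M(1-\delta) \int_{R^{kl}} dy_1^{kl}\, f_0(y_1^{kl})\, m_0(y_1^{kl})\, q_i^*(y_1^{kl})$$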

Remarks Given $l$, given length $kl$ of observation sequences, we will denote the above encoding
scheme $v_{l,k}^*$. We will denote by $\{v_{l,k}^*, k \ge 1\}$ the sequence of encoders evolving from
$v_{l,k}^*$, for varying $k$ values. We note that the scheme utilizes $l$-size disjoint blocks of
observed data, where $l$ may be considered as a design parameter. In addition, it bounds disjoint
$l$-size blocks of data in $p_{0i}(z_1^{kl}(y_1^{kl}))$, for all $i$. This is in contrast to the
scheme in section IV, and is needed for asymptotic, $(k \to \infty)$, qualitative robustness.
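The first two steps of the encoding scheme can be sketched as follows, under stated assumptions: the nominal marginal is a zero-mean Gaussian with standard deviation `r0`, the conditional density given the truncated data is Gaussian with mean `cond_mean` and standard deviation `rho`, and `G` stands for the averaged block weight $k^{-1}\sum_j \min(1, \lambda_l/a_{j,l})$. All function and parameter names are hypothetical. The intervals are taken as equiprobable quantile cells, which satisfies the requirement $\int_{A_i} f_0(x)\,dx = M^{-1}$.

```python
from statistics import NormalDist

def equiprobable_edges(M, r0):
    """Edges of M intervals that each carry probability 1/M under N(0, r0^2)."""
    nominal = NormalDist(0.0, r0)
    inner = [nominal.inv_cdf(i / M) for i in range(1, M)]
    return [float("-inf")] + inner + [float("inf")]

def cell_probabilities(edges, G, cond_mean, rho):
    """(1 - G) / M + G * P(A_i | truncated data), for every interval A_i."""
    M = len(edges) - 1
    conditional = NormalDist(cond_mean, rho)

    def cdf(e):
        # Handle the two infinite edges explicitly.
        if e == float("-inf"):
            return 0.0
        if e == float("inf"):
            return 1.0
        return conditional.cdf(e)

    return [(1.0 - G) / M + G * (cdf(edges[i + 1]) - cdf(edges[i])) for i in range(M)]

edges = equiprobable_edges(4, r0=1.0)
probs = cell_probabilities(edges, G=0.7, cond_mean=0.4, rho=0.5)
```

With `G = 0` every interval receives probability $1/M$, so the observed data are ignored altogether; with `G = 1` the probabilities are those of the conditional density alone.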

We  now  express  a  lemma,  whose  proof  is  in  the  Appendix. 

Lemma  4 

Let $\{b_{im}\}$ be the one step prediction coefficients of the nominal Gaussian source, when $m$-size
observation sequences are given. Let $\{b_{im}\}$ be such that
$\sum_{i=1}^{m} |b_{im}| < c^* < \infty;\ \forall m$. Then, the sequence $\{v_{l,k}^*, k \ge 1\}$ of
predictive encoders is qualitatively robust at the nominal Gaussian source. That is, it satisfies both
continuity conditions (A) and (B) in section II.
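The summability condition on the one step prediction coefficients can be checked numerically for a given autocovariance sequence. The sketch below solves the normal equations for the coefficients and sums their magnitudes; it assumes a first-order autoregressive nominal source purely for illustration, and the names `one_step_coefficients` and `ar1_autocovariance` are hypothetical.

```python
import numpy as np

def one_step_coefficients(r):
    """Coefficients b with prediction of X_{m+1} equal to b @ [X_1, ..., X_m].

    r : autocovariances r[0], ..., r[m] of a zero-mean stationary source.
    """
    r = np.asarray(r, dtype=float)
    m = len(r) - 1
    lags = np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
    Q = r[lags]                      # m x m Toeplitz autocovariance matrix
    c = r[m - np.arange(m)]          # cov(X_{m+1}, X_i) = r[m + 1 - i], i = 1..m
    return np.linalg.solve(Q, c)

def ar1_autocovariance(a, m):
    """Autocovariances r[0..m] of X_t = a X_{t-1} + W_t with unit-variance W_t."""
    return np.array([a ** h / (1.0 - a * a) for h in range(m + 1)])

b = one_step_coefficients(ar1_autocovariance(0.6, 5))
abs_sum = float(np.sum(np.abs(b)))   # stays bounded in m for this source
```

For this first-order example only the most recent sample receives a non-zero coefficient, so the absolute sum equals $|a|$ for every $m$ and the condition of Lemma 4 holds trivially.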


Let $D_{l,k}(f_0, v_{l,k}^*)$ denote the mean difference-squared distortion induced by the encoding
scheme $v_{l,k}^*$ at the nominal Gaussian source. Let $D_{l,k}(f_0, \zeta, z, v_{l,k}^*)$ denote the
mean difference-squared distortion induced by $v_{l,k}^*$ when the $l$-dimensional observation density
is such that
$f(y_{jl+1}^{(j+1)l}) = (1-\zeta) f_0(y_{jl+1}^{(j+1)l}) + \zeta\,\delta(z\mathbf{1}_l)$, and let then
$\varepsilon_{l,k}^*$ be the breakdown point of $v_{l,k}^*$. Given $M$, we then easily find by
substitution, and as in section IV:


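$$D_{l,k}(f_0, v_{l,k}^*) = E_{\mu_0}\{X^2\} - (1-\delta) M \sum_{i=1}^{M} \left[ \int_{R^{kl}} dy_1^{kl}\, f_0(y_1^{kl})\, m_0(y_1^{kl})\, q_i^*(y_1^{kl}) \right]^2 \left[ 2 - (1-\delta) M \int_{R^{kl}} dy_1^{kl}\, f_0(y_1^{kl})\, q_i^*(y_1^{kl}) \right]$$

$$D_{l,1}(f_0, \zeta, z, v_{l,1}^*) - D_{l,1}(f_0, v_{l,1}^*) = \zeta (1-\delta) M \sum_{i=1}^{M} \left[ \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q_i^*(y_1^l) \right]^2 \left[ 2 - (1-\delta) M \int_{R^l} dy_1^l\, f_0(y_1^l)\, q_i^*(y_1^l) + M(1-\delta)\, q_i^*(z\mathbf{1}_l) \right]$$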


C/. l  *  =  { 1-K1— 5)  X(Qoi*)2  Z(Qoi*)2[2-(l-5)M  Jdy/i/0(y,1)q1*(y,1)]  }' 

i=l  li=l  R' 


; where,

$$Q_{0i}^* \triangleq \int_{R^l} dy_1^l\, f_0(y_1^l)\, m_0(y_1^l)\, q_i^*(y_1^l)$$
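Let us define, for $\{a_{j,l}(y_1^{kl})\}$ as in (60) and $z_1^{kl}(y_1^{kl})$ as in (61),

$$q^*(x, y_1^{kl}) \triangleq \left[ 1 - k^{-1} \sum_{j=0}^{k-1} \min\left(1, \frac{\lambda_l}{a_{j,l}(y_1^{kl})}\right) \right] f_0(x) + k^{-1} \sum_{j=0}^{k-1} \min\left(1, \frac{\lambda_l}{a_{j,l}(y_1^{kl})}\right) f_0\left(x \mid z_1^{kl}(y_1^{kl})\right)$$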


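Then, if $I_{l,k}(f_0, z, v_{l,k}^*)$ denotes the influence function of the operation $v_{l,k}^*$, and in
parallel to the expressions (42), (43), and (48) in section IV, we find the following asymptotic,
$(M \to \infty)$, expressions:

$$\lim_{M \to \infty} D_{l,k}(f_0, v_{l,k}^*) = E_{\mu_0}\{X^2\} - 2(1-\delta) \int_R dx\, f_0^{-1}(x) \left[ \int_{R^{kl}} dy_1^{kl}\, f_0(y_1^{kl})\, m_0(y_1^{kl})\, q^*(x, y_1^{kl}) \right]^2 + (1-\delta)^2 \int_R dx\, f_0^{-2}(x) \left[ \int_{R^{kl}} dy_1^{kl}\, f_0(y_1^{kl})\, q^*(x, y_1^{kl}) \right] \left[ \int_{R^{kl}} dy_1^{kl}\, f_0(y_1^{kl})\, m_0(y_1^{kl})\, q^*(x, y_1^{kl}) \right]^2 \qquad (71)$$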



Remarks The asymptotic expressions in (72) and (73) correspond to $l$-size observation blocks
and asymptotically many mapping values $\{v_i^*\}$. For $l$-order Markov nominal Gaussian sources,
those expressions represent the asymptotic, $(M \to \infty)$, influence function and breakdown point
induced by the encoding scheme $\{v_{l,k}^*\}$ at the nominal source, for any $k$. Comparing
expressions (71), (72), and (73) with expressions (42), (43), and (48) in section IV, we can draw
the following conclusions:

The encoding scheme in Theorem 1 induces smaller mean difference-squared distortion at the
nominal source than the scheme $v_{l,1}^*$ does. However, the breakdown point of the former is
generally smaller than the breakdown point of the latter. The influence function of $v_{l,1}^*$ is
bounded, and it converges to its bound more slowly than that of the scheme in Theorem 1 does.

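If $H_{\mu_0}(v_{l,1}^*)$ denotes the differential entropy induced by the scheme $v_{l,1}^*$ at the
nominal source, and for $B_{\mu_0}(v_l^*)$ as in (54), $g(y_1^l)$ as in (53), and $\rho_l$ as in (38),
we find via methods such as those in the proof of Lemma 3:

$$H_{\mu_0}(v_{l,1}^*) \le 2^{-1} \left[ 1 + \ln 2\pi\rho_l^2 \right] + B_{\mu_0}(v_l^*) + 2^{-1} \int_{R^l} dy_1^l\, f_0(y_1^l) \left[ 1 - g(y_1^l) \right] g(y_1^l) \left[ m_0^2\left(z_1^l(y_1^l)\right) - m_0^2(y_1^l) \right] \qquad (74)$$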

From the results in Lemma 3, in conjunction with (74), we conclude:

The scheme $v_{l,1}^*$ induces lower differential entropy at the nominal
source than the scheme in Theorem 1 does.

Limiting  Behavior 

The sequence $\{v_{l,k}^*, k \ge 1\}$ of encoders in this section was designed especially for
non-Markovian nominal Gaussian sources and asymptotically long observation sequences. Thus, the study of
its performance characteristics for $k \to \infty$ is important. We will perform such studies for the case
where the mapping values $\{v_i^*\}$ are asymptotically many; that is, for $M \to \infty$. We first
express a theorem, whose proof is in the Appendix.

Theorem  2 

The influence function $\lim_{M \to \infty} I_{l,k}(f_0, z, v_{l,k}^*)$ is uniformly bounded from above,
for every $z$ and every $k$. The breakdown point $\lim_{M \to \infty} \varepsilon_{l,k}^*$ is uniformly
bounded from below by a strictly positive constant, for every $k$.


In view of Theorem 2, we remind the reader that the optimal at the nominal source predictive
encoding operation induces asymptotic, $(M \to \infty)$, breakdown point equal to zero, and unbounded
quadratic asymptotic, $(M \to \infty)$, influence function, for every dimensionality of the observation
sequence. As $k$ increases, the asymptotic, $(M \to \infty)$, mean difference-squared distortion induced
by the sequence $\{v_{l,k}^*\}$ of encoders at the nominal source decreases monotonically, but remains
uniformly higher than that induced by the optimal at the nominal sequence of predictive encoders. Given
$k$, the former is given by expression (71), whereas the latter is given by expression (39) in section IV.
Let $H_{\mu_0}(v_{l,k}^*)$ denote the differential entropy induced by the encoding scheme $v_{l,k}^*$ at
the nominal source. Then, we express a lemma, whose proof is in the Appendix. For $\rho_l$ as in (38) and
$r_0$ and $\sigma_l$ as in (53), we first define,

$$G_{k,l}(y_1^{kl}) \triangleq k^{-1} \sum_{j=0}^{k-1} \min\left(1, \frac{\lambda_l}{a_{j,l}(y_1^{kl})}\right)$$


Lemma  5 


For $g(y_1^l)$ as in (53), and for,


D(v,.k*)  t  j  ( <iy\Ifo(y\,)Gkj(y)1)[  1  -  GM(yf  j  [otf+og, 

+  r52(l+CTk?)/n§(zV(yV)) 

-  /nawj  I"  1  -  j'  dy  t/o(y i  )g(y  t )[ 


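The differential entropy $H_{\mu_0}(v_{l,k}^*)$ is bounded as follows:

$$H_{\mu_0}(v_{l,k}^*) \le \frac{1}{2} \left[ 1 + \ln 2\pi\rho_{kl}^2 \right] + D(v_{l,k}^*) \qquad (77)$$

If $\{b_{im}\}$ are the one step prediction coefficients of the nominal Gaussian source when $m$-size
observation sequences are given, and if $\sum_{i=1}^{m} |b_{im}| < \infty,\ \forall m$, then there exists
$c_l^* < \infty$ such that $|m_0(z_1^{kl}(y_1^{kl}))| \le \lambda_l c_l^*$. Then, we find a looser upper
bound on $H_{\mu_0}(v_{l,k}^*)$, as follows:

$$H_{\mu_0}(v_{l,k}^*) \le \frac{1}{2} \left[ 1 + \ln 2\pi\rho_{kl}^2 \right] + C(v_{l,k}^*)$$

; where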


t  T  Okf  +oc,  +  r02(l+oj-r)X/2(c,*)2  -i  J  )g(y/, )  - 

“  J  L  K' 

-  I  dy^/0(yf)Gk2,(yv]  -  ["/nerj  1-f  dyi/0(y',  )g(y', ) 


Remarks The differential entropy $H_{\mu_0}(v_{l,k}^*)$ decreases monotonically with increasing $k$, and
remains strictly higher than the differential entropy induced by the optimal at the nominal predictive
encoder, at the nominal source, (given $k$, the latter equals
$\frac{1}{2}\left[1 + \ln 2\pi\rho_{kl}^2\right]$). In the limit, $(k \to \infty)$, the bound in (78) can
be as small as the asymptotic mean-squared error, $\lim_{n \to \infty} \rho_n^2$, of the optimal at the
nominal source one-step predictor allows. This depends on the spectral characteristics of the nominal
Gaussian source.
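The dependence of the asymptotic one-step prediction error on the source spectrum can be made concrete with the classical Kolmogorov-Szegő formula, which is a standard result and is not derived in this paper: for a spectral density $S(\omega)$ normalized so that the variance equals $(2\pi)^{-1}\int_{-\pi}^{\pi} S(\omega)\,d\omega$, the limiting error is $\exp\{(2\pi)^{-1}\int_{-\pi}^{\pi} \ln S(\omega)\,d\omega\}$. The sketch below evaluates it for an assumed first-order autoregressive spectrum; the function names are hypothetical.

```python
import numpy as np

def asymptotic_prediction_mse(spectral_density, n=4096):
    """exp of the average log spectral density over (-pi, pi), midpoint rule."""
    omega = -np.pi + (np.arange(n) + 0.5) * (2.0 * np.pi / n)
    return float(np.exp(np.mean(np.log(spectral_density(omega)))))

def ar1_spectral_density(omega, a=0.6, innovation_var=1.0):
    """Spectral density of X_t = a X_{t-1} + W_t with Var(W_t) = innovation_var."""
    return innovation_var / (1.0 - 2.0 * a * np.cos(omega) + a * a)

mse = asymptotic_prediction_mse(ar1_spectral_density)
variance = 1.0 / (1.0 - 0.6 ** 2)   # source variance, for comparison
```

For this example the limiting error equals the innovation variance, which is strictly smaller than the source variance; a flatter spectrum would bring the two closer together and leave less room for entropy reduction by prediction.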


VI.  CONCLUSIONS 


In  this  paper,  we  considered  predictive  encoders  with  distortion,  for  entropy  reduction.  We 
considered  a  stationary  and  Gaussian  nominal  source  and  we  designed  and  analyzed  qualitatively  robust 
predictive  encoders,  for  resistance  to  data  outliers.  Our  encoders  offer  protection  against  outlier  values,  at 
the  expense  of  increased  distortion  and  differential  entropy,  at  the  nominal  source. 


APPENDIX 


Proof  of  Theorem  1 

Let $\mu_1$ and $\mu_2$ be given, and let $f_1$ and $f_2$ denote their corresponding densities. Let
$f_{\mu_0,\mu_2}(x, y_1^l)$ denote the joint density of the datum $X_{m+1}$ from the nominal process, at
the point $x$, and the random vector $Y_1^l$ from the observation process at the vector point $y_1^l$.
Then, from class $F_{l,\delta}$ we conclude:


/^(x.y'i )  =  (l-Sj/ofx.yi )  +  5/0(x)  f2(y\ )  -  (l-5)/0(y \  j 


(A.l) 


/ji0.(ij(xiyi)  = 


/^(x.yj) 

fiiy'i) 


0-S)/o(yi)~ 

f2(yi) 


/o(x)  + 


+ 


(i-5)/o(yi) 

fiiy'i) 


/o(xlyi) 


(A. 2) 


p..2(y  1 )  t  <x  1  y'i  >dx  =  m-1 

A, 


(i-5)/~o(yi) 

/2(yi) 


t  n-6i/»(yi) 

f2(y') 


pol(yi) 


(A. 3) 


Let  us  define. 


b{l,2)  =  ( 1-6)  Jdy^i  (y',  )p,,2(y \ ) 


Jdy  i/o(y  1  )mo(yfi  )p,,2(y,i ) 


Then,  we  easily  find, 


(A.4) 


D, 


Hi.p2.(b['-2))1  =  EK{X2)-(1-8)2  I  [jdy'v/i(y,,)p,2(y,1) 

J  1=1  *- 


-1 


< 


z[jdy,i/i(y/i)Pi.2(y^  '[dyi/oCy'iM^y'OPuCy') 


^  |  Sjdy  1/1  (y'i  )Pi,2  (y i  j  x[ jdy i/o(y  i  >*o(y  i  )p,.2(yi  j 2  = 

=z[jdyi/o(y,i)mo(y/i)Pu(y,i) 2 


with equality in (A.6) if $f_1(y_1^l) = f_2(y_1^l);\ \forall y_1^l \in R^l$. From (A.6) and (A.5) we conclude,

D /  Hi,P2.{b[u>}]  <D,(p2.P2.{bp,2)})  = 

=  EmJX2}-(1-S)2M  ifjdyi/oCy'.^Cy'Op^Cy')!  2 

i=iL  J 


;  where. 


bP-2)  =  (1-8)M  }  dy'if0(y\)m0(y\)pit2(.y'i)  (A.8) 

Now,  supD,4i2.P2.{bp-2)))  corresponds  to  inf  M  £  [  jdy'i/o(y/1)m0(y'i)p,,2(y‘i)|  V 

Application of the calculus of variations gives that $f^*$ in (23) attains the latter infimum. The proof
of the theorem is now complete.

Proof  of  Lemma  1 

From  (24).  „e  have  Mm  Vi.  Also.  Um  R(d<).|<J  . 

Substituting  the  above  in  (30)  and  (31),  in  conjunction  with  (28)  and  (29),  we  find  that 
D,fo.(;,±».v?)<E^{X2};  V£  <,  and  D/(/'0,^,±«>,v/*)  <  E^  (X2 }; 

Proof  of  Lemma  2 

We easily conclude, for $q(x, y_1^l)$ as in (41):

$$q(x, y_1^l) \le f_0(x) + f_0(x \mid y_1^l)$$

and thus,



2/o‘(x)  -  (I-«y;a(x)/dyi/0(ylI)q(x.y/I)  >  25 fo'(x) 

Also, 

/dy  i/o(y!  )"»o(y{  )q(x,yi )  < 
^/oCxlJdy'i/oCy', )  I  m0(y', )  I  min(  1  ,X,  { (y',  )TQ/_1  y }“l/2) 
+  /dy  i/o(y i )  I  m0(y { )  I  min(  1 , X,  { (y )TQJ- 1  y \ } 1  < a )fQ (x  I  y \ ) 
<  2  X,m/0(x) 

Applying  (A.9)  to  (48),  we  find, 


lim  £/*> 

M— »«> 
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1+5 


Applying  (A.  10)  to  (48)  and  (46),  we  find. 


<  (1+5)4^m?  {(l+5)4X?m,2  +(l-5)|dx^'(x) 


jdy/i/o(yi)m0(y,1)q( 

R' 


^/im  I/(^0,z,v/*)  <  (1 


3-5-(  1  -5)m  in(  1 ,  XtCt 1  z  I 


+  (1-S)2  4X?mfmin(l,A.,C/lzl  ') 


(l-5)2Jdx/^2(x) 

R 

Jdy,i/0(yi)q(x,y,I) 

.R' 

Jdy'i/o(yi)/n0(yi)q(x,y/1) 

.R' 

=  (1-5)(3-5)4X2w2 

(l-5)2|dx/^2(x) 

R 

Jdyi/o(yi)q(x,yi) 

.R' 

Jdy/i/0(y,|)m0(y,1)q(x.y,1) 

_R' 

Proof  of  Lemma  3 


Clearly,  forq(x.y', )  as  in  (41),  wc  have, 


-HM(v,*)>  Jdy  i/(y  i )Jdxq(x,y,t  )/ogq(x,y,) ) 




;  where. 


l* 


Jdxq(x,y/1)/ogq(x,y/i)  =  [ l-min(l,A|{(y{)TQ/1yir,/2)  }dx/0(x)togq(x,y/1) 

D  L  J  D 


+  min(l,X/,{(y'1)TQ/,y,,}-1/2)Jdx/0(xlyi)/ogq(x,yi) 


(A.  15) 


k 


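Let us define,

$$g(y_1^l) \triangleq \min\left(1, \lambda_l \left\{ (y_1^l)^T Q_l^{-1} y_1^l \right\}^{-1/2} \right) \qquad (A.16)$$

$$r_0^2 \triangleq E_{\mu_0}\{X^2\} \qquad (A.17)$$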


Then,  from  (A.  15)  and  the  convexity  of  the  logarithmic  function,  we  obtain: 

C(yi)  *  Jdxq(x,y'i)/ogq(x,y',)  =  [l-g(y/,)]Jdx/0(x)/ogq(x,y,!)  + 

R  R 

+  g(yi)Jdx/0(xly'l)/ogq(x,y,1)  > 

R 

^  l  l~g(y'i  )12  Jdx/0(x)/og/'0(x)  +  [l-g(y',  )]g(y'  )Jdx/0(x)/ogf0(x  I  y',) 

R  R 

+  n-g(yi  )]g(y'i )  Jdx/o(x  I  y',  )/ogf0(x)  +  g2(y  i )  fdx/0(x  I  y',  )logf0(\  I  y', ) 

R  R 

=  — 2_1  [  1— g(y  i  )]2 1 1  +/og2Ttr£)-2-tg2(y,,)[/og2jtp?  +  1] 

-2_Ig(y'i)[l-g(yi)l(/og2rtr§  +  /og27tp^  +  pf2[r§  +  m2(y',)]  +  r52[p?  +  m2(y',)l) 

=  -2_1  [  1  +  /og2;tp }  I  +  2'1  [  l-g(y'  )]/ogo?  - 

— 2-1  g(y  1  )l  1— gfy'i  )H— 1  +or  +  <V  +pr20  +o2)m2ty/|)l  (A. 18) 
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ft 
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n 
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c? 


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;  where,  log  is  the  natural  logarithm  In,  and  where, 


-2  A  -2  _2 
<*l  _r0  Pi 


Substituting  (A.  18)  in  (A.  14),  we  find  (54).  Similarly,  we  find, 


-H^,(v/*)  >  (1-0  Jdy,l/0(y/1)/dxq(x,y,1)fogq(x.y,1)  • 


+  ^JdxqCx.zI^/ogqCx.zl')  > -(1-02  '[1  +  /og2;tp?]- 


(A.19) 


-{1- 0Bp,(V/*)  -  C2"1  [1  +  loglnpj]  +  C2-I[l-g(2I/)]/ogcr?  - 

g(zi;)[  i-g(2i/)][-i + or + or2 + pr2(i + o?)z2m2(i')i  (A.20) 
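The four integrals that appear in the proof of Lemma 3 are all of the form $\int p(x) \ln q(x)\,dx$ with $p$ and $q$ Gaussian, for which the standard closed form is $-\frac{1}{2}\ln 2\pi s_q^2 - \left[s_p^2 + (\mu_p - \mu_q)^2\right]/(2 s_q^2)$. The sketch below checks this identity numerically; the function names are hypothetical, and the values used for the source variance, the prediction error variance and the conditional mean are arbitrary illustrations.

```python
import numpy as np

def gaussian_cross_term(mean_p, var_p, mean_q, var_q):
    """Closed form of the integral of p(x) * ln q(x) for Gaussian p and q."""
    return -0.5 * np.log(2.0 * np.pi * var_q) - (var_p + (mean_p - mean_q) ** 2) / (2.0 * var_q)

def numeric_cross_term(mean_p, var_p, mean_q, var_q, n=200001, span=12.0):
    """Riemann-sum approximation of the same integral on a wide grid around p."""
    half_width = span * np.sqrt(var_p)
    x = np.linspace(mean_p - half_width, mean_p + half_width, n)
    p = np.exp(-(x - mean_p) ** 2 / (2.0 * var_p)) / np.sqrt(2.0 * np.pi * var_p)
    log_q = -0.5 * np.log(2.0 * np.pi * var_q) - (x - mean_q) ** 2 / (2.0 * var_q)
    return float(np.sum(p * log_q) * (x[1] - x[0]))

r0_sq, rho_sq, m = 2.0, 0.5, 0.7   # illustrative values only
# Unconditional density against conditional density, and the reverse.
cross_a = gaussian_cross_term(0.0, r0_sq, m, rho_sq)
cross_b = gaussian_cross_term(m, rho_sq, 0.0, r0_sq)
```

Setting $p = q$ recovers the negative differential entropy $-\frac{1}{2}\left[1 + \ln 2\pi s^2\right]$, which accounts for the two diagonal terms in the bound.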

Proof  of  Lemma  4 

The mapping $q_i^*(y_1^{kl})$ is clearly pointwise continuous for every finite $k$ and every $i$, since
$\min\left(1, \lambda_l / a_{j,l}(y_1^{kl})\right)$ and $p_{0i}(y_1^{kl})$ are both pointwise continuous,
for every $i$ and $j$, and every finite $k$.

Let  now  k  be  given,  and  let  then  xj;  and  y^  be  two  sequences  such  that 
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xBU* 


<  a,  0  <  a<  1,  fork(l-a)  of  the  k  i’s.  Then, 


lwto(z!7(yi,))-^o(zi/(xi/))l  <ac*  +  X,ac*  =  ac*(l +\,) 
and  given,  x*'.  given  Ei  >  0,  there  exists  ot]  >  0,  such  that, 

(#j  :  Y/(yj j+!y.  *£F)  >  a, }  <  ka,  implies, 
IPo.(zi,(yi,))-p«.(zi'(X|,))l  <6i  ;  Vi  (A. 2 1 ) 


Similarly,  given  x1}',  given  £2  >  0,  there  exists  S2  >  0  and  53  >  0,  such  that. 


Y;[  xjj+P.  yjj+p]  <52  implies  I  min  1, - —  -min  1,  X/overcayfyn  I  <  e2  (A.22) 
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{#j  :  Yf(xj9+iy>  yjl+i0')  <  k53  and  Y/[x^ly,  yj£i!)/]  <  53 
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imply,  Imin  1,  — 7^-77-  PoiCz^x^)) -  min  1,  - ‘-rr-  PoiCzVCyV))!  <£2  (A.23) 

ay(x^)J  ay(yf)J 


Given  x*7.  let  now  y f  be  such  that: 


{#j  :  Y/(yj)+iiy,  xj/:,Iy)  >  a}  <  ka,  for  some  a  such  that  a  <  min(52,53) 


lq1*(xf/)-q1*(yi')l  Sk'1  £  lmin  !. — TUT  ~min  1. — 7^77  1 

j=o  aj,/(x  )J  (.  ay(y,  )_ 
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+  k_1  X  lmin  1,  — -  Poi(ziW))  -  min  1,  — Poi(z i'(y  1O) I  ^ 
i=o  ^  aj,/(xi  )J  [  aj  /(yi ) 

<  2(l-a)e2  +  2a 


(A. 24) 


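From (A.24) we finally conclude that given $x_1^{kl}$, given $\varepsilon > 0$, there exists $\alpha > 0$,
such that,

$$\#\left\{ j : \gamma_l\left(y_{jl+1}^{(j+1)l}, x_{jl+1}^{(j+1)l}\right) > \alpha \right\} < k\alpha \quad \text{implies} \quad \left| q_i^*(x_1^{kl}) - q_i^*(y_1^{kl}) \right| < \varepsilon;\ \forall i.$$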


Proof  of  Theorem  2 


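Let us define,

$$q_j^*(x, y_1^{kl}) \triangleq \left[ 1 - \min\left(1, \frac{\lambda_l}{a_{j,l}(y_1^{kl})}\right) \right] f_0(x) + \min\left(1, \frac{\lambda_l}{a_{j,l}(y_1^{kl})}\right) f_0\left(x \mid z_1^{kl}(y_1^{kl})\right) \qquad (A.25)$$

Then,

$$q^*(x, y_1^{kl}) = k^{-1} \sum_{j=0}^{k-1} q_j^*(x, y_1^{kl}) \qquad (A.26)$$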


If $D_{l,k}(f, v_{l,k}^*)$ denotes the mean difference-squared distortion induced by $v_{l,k}^*$ at the
density $f$, then,

$$\lim_{M \to \infty} D_{l,k}(f, v_{l,k}^*) = k^{-1} \sum_{j=0}^{k-1} D^j(f, v_{l,k}^*) \qquad (A.27)$$


;  where. 


0?V,V/,k*)  *  {X2 }  -  2(l-S)Jdx/^,(x) 


f  udyi,/0(yi/)m0(yi/)q*(x,yi/) 


•  J  dy  h/(yV)qJ*(x,yi,)|x^x  I  yf')dx 
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(A.2S) 

Due to (A.27), we conclude that the influence function induced by $v_{l,k}^*$ is the average of
the influence functions induced by the operations $\{q_j^*, 0 \le j \le k-1\}$. Also, if $p_{l,j}^*$
denotes the breakdown point of the operation $q_j^*$, then the breakdown point of $v_{l,k}^*$ is bounded
from below by $\min_j p_{l,j}^*$. From (A.28), and due to the boundedness of the vector
$z_1^{kl}(y_1^{kl})$, we now conclude
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(l-5)2/dx/:2(x) 


j/o(yki‘)m0(yYq*(x,yY) 
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that  there  exists  some  positive  constant,  d*,  such  that  p/j*  >  d*  /im  £/it*  ;  Vj.  If  1/ j(/o,z,qj*) 

M-x»  '  '  ’  J 


denotes  the  influence  function  of  the  operation  qj*  at  the  nominal  source,  we  also  conclude  that 
there  exists  some  finite  constant,  e*,  such  that,  //,j</0,z,q.*)  <  e*  /im  If.il/o.z.V/  i*).  The 
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Theorem  easily  follows  from  the  above,  where  /im  I/,i</0,z,V/ 1*)  is  given  by  (72)  and  /im  £;>1* 

M— ’  1  M— 

is  given  by  (73). 

Proof  of  Lemma  5 
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For  q*(x,y  f)  as  in  (70),  we  clearly  have. 


-H ^(v/.k*)  >  J  dyP/0(y?7)Jdxq*(x,yfi)/nq*(x,yV) 

Ru  R 
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Forq^Cx.y^)  as  in  (A. 25),  and  due  to  the  convexity  of  the  logarithmic  function,  we  have. 
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Jdxq*(x,yf)/nq*(x,yf)  =  k  1  £  Jdxqj*(x,y^)/nq*(x,yi/)  > 
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+  min  1, 
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Jdx/0(x  I  z^(y  i'))/n/0(x  I  z  i'(y  i7)). 
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Let  us  define. 


(A. 


gi,/(y  I7)  t  min  1, 
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Then,  from  (A.31),  and  for  ro  and  pm  as  in  the  proof  of  Lemma  3,  wc  obtain: 


Jdxqj*(x,y^/)/nqi*(x,y^)  >  -^{gi.Kyftgj.Kyft  +  [l-g^y^lEl-gj  ,(yk/)]) 
R  2 

— 1  ~ gi./(y  i7) } /n2Tcrg  -  ygi,/(yi,)/n2rtp^  - 

-tV[fo  +  ^oCzi'Cy  l'Wlgi.Ky  i')  -  tt^p  ^  +  ml(z\l  (yi/))]gj  ,/(yi') 
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+  ^~r  +  ^T+  ~T  +  “T  mo(zi/(yi/))}gi,/(yi/)gj./(yi/) 

2  Pw  ro  [  Pu 

Define, 

Gk,;(y  t7)  ^  k_1  2*  gj./(y  i7) 

j=0 

From  (A.30)  and  (A.33),  we  then  obtain: 


Jq*(x,y!i7)/nq*(x,y^)  >  -y  G^(yf)  -  y  [  l-Gk./(yi^  2 
-  y  [  I-Gk./Cyi7)  /n2nr§  -  y  Gk>/(y^)/n2icp^, 

-  y[oji2  +pu2m^(zl[,(y^  GM(yf) 

-  y  [o2/  +  Gk  ;(y  i^) 

+  y[ok2  +  o?,  +  pk/2(l  +a^)m2(zV(yV))]Gc,/(yf) 
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=  -  —[l  +/n2rep^]  +  [1  -GM(yi/)]/nak/ 

-  yGk>;(y^)[l-Gk  ;(yV)][Ok/2  +  ah  +  r52(l  +  ok?)m2(z^(y^))  -  2] 

;  where, 
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Ou  _  ro  Pu 

Applying  (A.35)  to  (A.29)  we  obtain  the  result. 
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